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Abstract. We introduce a derivative-free computational framework for approximat¬ 
ing solutions to nonlinear PDE-constrained inverse problems. The general aim is 
to merge ideas from iterative regularization with ensemble Kalman methods from 
Bayesian inference to develop a derivative-free stable method easy to implement in 
applications where the PDE (forward) model is only accessible as a black box (e.g. 
with commercial software). The proposed regularizing ensemble Kalman method can 
be derived as an approximation of the regularizing Levenberg-Marquardt (LM) scheme 
m in which the derivative of the forward operator and its adjoint are replaced with 
empirical covariances from an ensemble of elements from the admissible space of solu¬ 
tions. The resulting ensemble method consists of an update formula that is applied to 
each ensemble member and that has a regularization parameter selected in a similar 
fashion to the one in the LM scheme. Moreover, an early termination of the scheme is 
proposed according to a discrepancy principle-type of criterion. The proposed method 
can be also viewed as a regularizing version of standard Kalman approaches which are 
often unstable unless ad-hoc fixes, such as covariance localization, are implemented. 

The aim of this paper is to provide a detailed numerical investigation of the 
regularizing and convergence properties of the proposed regularizing ensemble Kalman 
scheme; the proof of these properties is an open problem. By means of numerical 
experiments, we investigate the conditions under which the proposed method inherits 
the regularizing properties of the LM scheme of m and is thus stable and suitable 
for its application in problems where the computation of the Frechet derivative is 
not computationally feasible. More concretely, we study the effect of ensemble size, 
number of measurements, selection of initial ensemble and tunable parameters on the 
performance of the method. The numerical investigation is carried out with synthetic 
experiments on two model inverse problems: (i) identification of conductivity on a 
Darcy flow model and (ii) electrical impedance tomography with the complete electrode 
model. We further demonstrate the potential application of the method in solving 
shape identification problems that arises from the aforementioned forward models by 
means of a level-set approach for the parameterization of unknown geometries. 

Submitted to: Inverse Problems 

1 . Introduction 

We propose a computational derivative-free regularization method for approximating 
solutions to nonlinear PDE-constrained inverse problems. More precisely, the aim of the 
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method is to identify parameters in PDE models given data/observations of the solution 
of the PDE. Inverse problems of this kind are ill-posed in the sense of stability; small 
perturbations of the data may have uncontrolled effects on the approximation of the 
unknown parameters in the PDE. Therefore, the computation of stable approximations 
of solutions to these inverse problems requires regularization. Classical regularization 
methods (e.g. Tikhonov regularization) na reformulate the inverse problem so that the 
regularized version can be solved with, for example, standard optimization methods. 
In contrast, iterative regularization approaches regularize while computing a stable 
approximation to the inverse problem [22]. The aim of these methods is to compute an 
estimate/approximation controlled by the noise level. As the noise in the data goes to 
zero, this approximation converges to a solution of the identification problem. 

1.1. Contribution of this work 

Most existing iterative regularization methods [22| require the implementation of the 
Frechet derivative of the forward map as well as the corresponding adjoint operator. 
In various applications, however, standard software for forward simulation does not 
provide numerical approximations of the Frechet derivative and/or its associated adjoint. 
In some cases, even if the linearization of the forward map is computed within the 
forward simulation (e.g. in a Newton-type solver for nonlinear PDFs) this may not be 
accessible in a modular fashion suitable for an iterative regularization framework. In 
this paper we present a derivative-free ensemble Kalman-based iterative regularization 
technique for the approximation of PDF-constrained identification problems. While 
standard ensemble Kalman methods are aimed at approximating an inverse problem 
posed in a Bayesian inference framework, the objective of the present work is to 
merge ideas from iterative regularization with ensemble Kalman methods to develop 
a computational framework with the regularization properties needed to solve classical 
(deterministic) identification problems. The proposed framework offers the flexibility 
of typical implementations of ensemble Kalman methods often used for large-scale data 
assimilation. In particular, it uses the forward map in a black-box fashion which makes 
it easy to implement and thus ideal for applications where modeling and simulation are 
performed with complex computer codes. 

The proposed method can be derived as an approximation of the regularizing 
Levenberg-Marquard (LM) scheme developed by Hanke in [Tl|. More specifically, we 
construct an iterative ensemble method from the regularizing LM scheme by replacing 
operators involving the Frechet derivative of the forward map and its adjoint by empirical 
covariances from an ensemble of elements from the parameter space. Members from this 
ensemble are iteratively updated according to an expression that resembles the Kalman 
update from standard ensemble Kalman filter/smoother methods [I3]. However, in 
contrast to those standard methods, we propose an update formula with a regularization 
parameter whose selection is made similar to the one of the regularization parameter 
in the LM scheme na but with the aforementioned ensemble approximations of the 
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forward map and its adjoint. Similarly, we propose an early termination of the scheme 
motivated by the discrepancy principle used in iterative regularization methods. The 
proposed ensemble Kalman regularizing scheme can then be regarded as (i) a derivative- 
free approximation of the regularizing LM scheme and/or (ii) a regularizing version of 
standard iterative Kalman methods |16j . 

An iterative regularizing ensemble Kalman method of the type presented here 
has been recently introduced in [12] for the solution of Bayesian inverse problems 
in reservoir modeling applications. The aim of [I9| was to show that importing 
ideas from iterative regularization may improve the performance of Kalman methods 
for approximating the Bayesian posterior. The classical approach to the inverse 
problem is now pursued in the present work. More concretely, we wish to assess 
the convergence and regularizing properties of ensemble Kalman methods derived from 
iterative regularization techniques. In contrast to [19] where the framework was Bayesian 
and focused on reservoir applications, our objective here is to study the performance 
of regularizing ensemble methods for solving classical identification problems in generic 
PDE-constrained applications. 

The contribution of the present work is threefold. First, we show that the proposed 
method addresses the ill-conditioning typically exhibited by existing implementations 
of ensemble methods. We show that importing ideas from iterative regularization such 
as the discrepancy principle can offer stability that, in existing implementations are 
often treated with ad-hoc methodologies such as covariance localization and covariance 
inflation. The second contribution of this paper is to showcase the potential application 
of the proposed methods for addressing a wide range of parameter identification 
problems. We show that, with reasonable computational cost, the proposed method 
can be not only computationally advantageous but also robust and accurate. We 
consider two model inverse problems: (i) identification of hydraulic head in a Darcy flow 
model and (ii) electrical impedance tomography (FIT) with a complete electrode model 
(CFM). In addition, we display the capabilities of the proposed method to solve shape 
identification problems where the computation of the shape derivative of the forward 
map may be cumbersome. In concrete, we combine the proposed ensemble method 
with the level-set approach of im to estimate shapes whose boundaries determine 
regions of sharp discontinuities of parameters in the PDF models under consideration. 
Since our methodology does not require derivatives, our level-set based formulation is 
considerably more simple than standard approaches where shape derivatives are needed 
[SlElEo]. Moreover, the proposed ensemble level-set approach does not use the level-set 
equation as in standard formulations; our iterative scheme induces a stable evolution 
of the unknown interface. In addition, by initializing the ensemble according to the 
ideas recently proposed in ini we avoid computational issues such as the flattening 
of the level-set function which is often observed with standard methods and typically 
addressed with ad-hoc techniques. The third contribution of the present work is to 
provide an extensive numerical investigation in order to assess the convergence and 
regularization properties of the proposed ensemble scheme with respect to ensemble 
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size, number of measurements, initial ensemble and tunable parameters. Although 
the convergence theory of the proposed regularizing ensemble Kalman method is an 
open problem, our numerical study offers deep insight into the potential application of 
iterative regularization for the development of derivative-free ensemble methods thereby 
opening up a whole new field for application and theory. 

The rest of the manuscript is organized as follows. In Section we introduce 
the general framework for the identification problem. The proposed ensemble Kalman 
method is introduced in subsection 12.21 The test models used for the validation of 
the proposed scheme are introduced in subsection 2.3[ Preliminary numerical examples 
that show the regularizing properties of the method are presented in subsection 2.4 In 


Section we discuss general aspects and properties of the proposed scheme including 
the derivation of the method as an ensemble approximation of the regularizing LM 
scheme of pj. In Section]^ we then conduct an extensive numerical investigation of the 
proposed scheme. In particular, we study the relation between the ensemble size and the 
number of measurements in terms of their effect on the regularization and convergence 
properties of the scheme. We investigate the effect of the tunable parameters of the 
scheme and the regularization properties with respect to the ensemble size in the small 
noise limit. In Section we discuss some potential applications of the scheme for the 
solution of geometrical inverse problems. Conclusions and future research are provided 
in Section 


2. PDE-constrained inverse problems 

2.1. General framework 

Let us denote hy G : X ^ Y the (nonlinear) forward operator that arises from the 
PDE-constrained model under consideration. In other words, G maps the space X of 
PDE parameters (e.g. coefficients, source terms and/or boundary conditions) to the 
observation space Y defined in terms of observable quantities related to the solution 
of the PDE (e.g. pointwise measurements of the solution). We assume that X and Y 
separable Hilbert spaces with norms denoted by || ■ ||x and || ■ ||y respectively. For the 
applications under consideration, the dimension of the observation space is often small 
(i.e. order of 10^ - 10“^). Therefore, we consider the case where Y is finite dimensional. 
However, upon discretization, the dimensions of X could be large (i.e. greater than 10®). 
Therefore, in order to derive algorithms robust under grid refinement we consider the 
case where dim(X) = oo] our aim is thus to keep our formulation within a functional 
analytical framework. 

The PDE-constrained model has an unknown parameter E X that we wish to 
identify from noisy measurements eY defined by 

= ( 1 ) 

where G K is noise which for the purpose of this exposition will be considered 
deterministic. In addition to we assume we have knowledge of the (weighted) noise 
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level rj defined by 

r,= ||r-‘''V-G(«'))llK (2) 

where F : Y —y is a self-adjoint positive-definite operator which for the present 
analysis can be understood as a weighting/scaling operator that enable us, for example, 
to include information concerning the precision of our measurement device. 

The identihcation problem is the following: Given y'^, find u & X such that 
G{u) = y^. Since the data may not be in the range of the forward operator we may 
alternatively formulate the identification as the minimizer of 

4(«) = ||r-'/2(,,i-G(t.))||y (3) 

However, forward operators G that arise from PDE models are typically compact 
and weakly sequentially closed m- From this property it follows that the inverse 
problem is ill-posed in the sense of stability. In other words, we may hnd such 
that G{un) —)■ y^ but ^ Therefore the computation of a minimizer of (|^ with 
standard (unregularized) iterative minimization approaches may be unstable. This lack 
of stability can be alleviated by means of iterative regularization methods. As discussed 
in Section iterative regularization provides a computational framework to compute 
stable approximation to the inverse problem. In concrete, their aim is to compute an 
estimate/approximation controlled by the noise level rj, that converges, in the small 
noise limit, to a solution of the identihcation problem, i.e. —)■ m as r; —)• 0, where 

the limit u satishes G{u) = G{u^). Most iterative regularization approaches require the 
computation of the Frechet derivative of the forward operator G. In some applications 
where commercial software is used for forward simulations Frechet derivatives may not 
available and so the application of those iterative methods may be limited. The objective 
of the present work is to introduce a derivative-free regularization ensemble Kalman 
method for the stable computations of the identihcation problem. 


2.2. The regularizing ensemble Kalman method 


Let us assume that we are given an ensemble of elements Uq'^ {j G {1,..., A),}) of 

the parameter space X. One may think of as an ensemble of potential initial 

guesses for any iterative regularization method applied for the stable identihcation of the 
unknown parameter ufi We will discuss the selection of such initial ensemble 
in subsection 13.11 

We now propose an iterative scheme where, at every iteration level n, each ensemble 
member Un^ is updated in such a way that the corresponding ensemble mean 


Urt. — 


1 


Ne 

Y 

i=i 


u 


(i) 


( 4 ) 


approximates the inverse problem in the small noise limit as described in subsection 2.1 


In other words, we need that algorithm stops at hnite iteration level n* producing an 
estimate = m* such that —)■ m as r; —)■ 0, where G{u) = G{u^) . The proposed 
regularizing ensemble Kalman method is presented below. 
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Algorithm 1 Iterative regularizing ensemble Kalman method 

Let C X be the initial ensemble of elements. Let p G (0,1) and r > 1/p. 

For n = 0,1,... 

(1) Prediction step. Evaluate 

(5) 

and define Wn = 

(2) Discrepancy principle. If 

-Wn)\\Y < rp ( 6 ) 

stop. Output Un = ^ 


(3) Analysis step. Define Cffi’, by 

Ne 


in 


Cr(-) = - ®»)(G(«®) - Wn), ■) 




iVe-1 


Y 


i=i 

Afe 


- Un){G{u^^'^) - Wn), ■)y- 


( 7 ) 

( 8 ) 


i=i 


Update each ensemble member: 

+ cr(cr + a„r)-‘(!,-> - wU'i), J e {1,...,ivj 

where is chosen by the following seguence 
<+' = 2*0°. 

where is an initial guess. We then define a.. = where N is the first integer 
such that 


( 9 ) 

( 10 ) 


- ®„)|| > p||r-'''=(s’’ - ®„)|| 

Note that, in contrast to other regularization techniques, the discrepancy principle 


is applied to Wn which (see expression (26)) is the average of the model output 


G{u^^^) of each ensemble member. While this quantity approximates G{un) to hrst order 


see subsection 3.3), it is clearly not the data misht of the proposed estimate. 


The proof of convergence of Algorithm [T] to a stable solution of the inverse problem 
is beyond the scope of the present manuscript; our aim is to offer numerical evidence of 
the convergence and regularization properties of the scheme. 


2.3. Test Models 

The proposed ensemble Kalman method will be tested on two PDE-constrained inverse 
problems. We consider small test inverse problems where we have the computational 
flexibility to conduct a large amount of numerical experiments in order to understand 
the effect that the tunable parameters, ensemble size, selection of initial ensemble and 
number of measurements have on the regularization properties and accuracy of the 
proposed method. We introduce the test models under consideration below. 
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2.3.1. Test model I. Darcy flow. We consider single-phase steady-state Darcy flow in a 
two-dimensional conflned aqnifer whose physical domain is D = [0, 6] x [0, 6]. k denotes 
the hydranlic condnctivity. The flow is described in term of the piezometric head h{x) 
{x E D) given by the solntion to [3] 

— V ■ nWh = f in D (11) 


where / is the sonrce which for the present work is defined by 

0 if 0 < xa < 4, 


f{Xi,X2) = 


137 if 4<X2<5, 
274 if 5<X2<6. 


The following bonndary conditions are considered 


dh 

h(a;,0) = 100, —(6,?/) = 0, 


dh dh 

- k |)( o ,») = 5 oo , ‘n^x,e)^o, 


( 12 ) 


(13) 


We are interested in recovering the logarithm of the hydranlic condnctivity u = log/?, 
from noisy pointwise measnrements of the piezometric head h. In other words, we 
consider 


G{u) = {h{xi ),..., h{xM)) 


(14) 


where h is the solntion to (11)-(13) and are the measnrement locations. This 

gronndwater model was used first used as benchmark for inverse modeling in [^. It 
has been also used as a test model for the identification of parameters with iterative 
regularization methods in [lH [15] and with an ensemble Kalman approach in [16] . 


2.3.2. Test model II. Complete Electrode Model. Our second test model is based on 
the Complete Electrode Model (CEM) for Electrical Impedance Tomography (EIT). 
The objective of EIT is to identify the conductivity k of a body D given measurements 
of voltages from a configuration of me electrodes on {ek}^li places on the boundary 
dD. The measured voltage arise from current patterns applied on those electrodes. The 
forward model associated to EIT is the CEM which consist of computing the voltage v 
in D and the voltages {14}^^ on that satisfy 


V ■ nVv 

= 0 

in D, 

(15) 

V -1- Xfc/tVn ■ n 

= W 

on Ck, k = 1,.. .,me, 

(16) 

Vv ■ n 

= 0 

on 

(17) 

/ kVu ■ n ds 

lek 

= 4 

k = 1,. . . ,me, 

(18) 


where are the currents injected through the electrodes, are the contact 

impedances of the electrodes and u = log n where k is the conductivity. Well posedness 
of the CEM model requires conservation of charge 

me 

i;4=o 

k=l 
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Given n and for each current patter I = there exists a unique solution 

[' 1^5 {14}^;^] to the GEM [3l]. The EIT problem consists now of Ending k and 
from a set of Up measurements of voltages Vi = {Vi^k}'k=i) • • • > obtained 

from Up current paters Ji = ■ ■ ■ In^ = For simplicity we assume 

that the contact impedances of the electrodes are known. Therefore, the identification 
problem is to find k given 

G(«) = [{nnK,.....{t4„t}E.] 

For a review of the EIT problem we refer the reader to |1]. 


2.4- Regularizing properties of the ensemble Kalman method. 


Although ensemble Kalman methods have been typically used in the statistical Bayesian 
framework, several publications uni ED have explored the use of these approaches 
for solving (deterministic) inverse problems such as the parameter identification PDF- 
constrained problems described earlier. However, most of these approaches are based 
on update formulas of the form of (|^ but with a fixed parameter = 1, i.e. 

+ CTiCT + (19) 


As we discuss in subsection |^, the standard choice = 1 is motivated by the 
application of Kalman methods for solving Bayesian inference problems when the model 
G is linear, and the underlying prior distribution is Gaussian [33] . For nonlinear forward 
models, however, the same choice of leads to instabilities that are often fixed with 
ad-hoc methods such as covariance localization. The main objective of the present work 
is to show numerically that such instabilities can be addressed by the proposed ensemble 
Kalman method when the ensemble size is sufficiently large. 

In this subsection we briefly present numerical evidence of the regularization 
properties of the proposed scheme; a detailed numerical investigation will be presented 
in Section We apply Algorithm to the identification of the “true” log hydraulic 
conductivity displayed in Figure (top). We use use synthetic measurements of 
hydraulic head from the Darcy model of subsection 2.3.1 with the measurement locations 
displayed in Figure]^ (left). Both the truth and the elements of the initial ensemble are 
generated from a Gaussian distribution. For details on the generation of synthetic data 
(avoiding inverse crimes) and the generation of initial ensemble, we refer the reader to 
subsection 4.1 In the middle row (resp. bottom row) of Figure[^we display the estimate 
obtained from the ensemble mean at different iterations computed with the standard 
unregularized approach (resp. the proposed regularized method). For the regularized 
method we select p = 0.7 and r = 1/p. For the unregularized method we simply apply 
(19) with no stopping criterion. Both methods are applied with the same (fixed) initial 
ensemble of W = 150 members generated as described in subsection |4.1[ Figure [^ 
(left) shows the data misfit ([^ for the ensemble mean computed with the standard 
unregularized approach (solid red line) and our regularized method (dotted-black line). 
Additionally, in Figure]^ (left) we also display (the dotted-blue line ) the data misfit with 
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respect to the averaged ensemble data predictions (expression (|^) which we monitor 
for the termination of Algorithm 

In Figure]^ (middle) we show the relative L^-error with respect to the truth for the 


ensemble mean Ur 


/t| 


1 ^ 1 .e. \\Un—u' ||i 2 /||M' 11 ^ 2 ), computed with the standard unregularized 
approach (solid red line) and our regularized method (dotted-blue line). For the standard 
method we see that a fast decrease of the data misfit and relative error are observed at 
the early iterations. While the data misfit keeps decreasing, the error starts to increase 
when apparently the data misfit drops below the noise level whose value is indicated 
with the horizontal line in Figure]^ (left). This increase in the error comes as no surprise 
since the corresponding estimates overfit the data. Iterative methods applied for the 
solution of ill-posed inverse problems often display such behavior [la [23]. In fact, a 
similar semiconvergent behavior was reported in the ensemble Kalman method of (TB] 
and inspired the present work where not only a regularization of the estimates need to be 
introduced (here by means of the parameter a„) but also the proper early termination of 
the scheme that avoids fitting the noise. Indeed, we note that the regularizing algorithm 
(see Figure [^, both relative error and data misfit decreases very slowly. Once the 
(averaged) data misfit (of expression (|^) is close to the noise level, the algorithm is 
stopped producing a considerably more accurate estimate of the truth than the one 
obtained with the unregularized algorithm which blows up the update at the early 
iterations. However, if the proposed regularized algorithm is not stopped, the error with 
respect to the truth could potentially increase as we observe in subsequent experiments 
(see Figure]^. From the log-conductivity estimates ( Figure [^middle and bottom rows), 
we see that the unregularized method produces uncontrolled estimates of the unknown 
while the proposed method results in small incremental transitions which eventually 
captures the truth more accurately. As we will discuss in Section the ability of the 
proposed method for regularizing the computations of the inverse problem depends on 
the size of the ensemble Ng. 

The selection of according to (34) is crucial for the regularizing of the proposed 
method. We monitor numerically these values of obtained from the application of 
Algorithm to the identification problem described in the preceding paragraph. We 
apply the method for several choices of the parameter p (p = 0.4, 0.5,0.6, 0.7, 0.8). 
In Figure (right) we display the values of that we obtained from the proposed 
selection (34) as the number of iterations increases. At the early iterations is large 
hence controlling the updates of the approximation. As the iteration progresses and 
the data misfit decreases, also decreases. For larger p the decay of is slower; this 
provides more regularization to the expense of a more costly algorithm (see Remark 3.1). 
From Figure]^ (right) we can observe that when the data misfit has dropped to a value 
close to the noise level by which the algorithm will be terminated according to (39), 
an has dropped down to a value close to «„ = 1 which is, in turn, the intrinsic value 
for the standard Kalman method. Interestingly, there is indeed an apparent intrinsic 
parameter = 1 once the scheme arrives at the optimal approximation of the inverse 
problem. However, choosing this parameter at early iterations can be substantially 
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detrimental to the performance of the scheme. In the following section, the selection of 
ttn is derived by using ensemble approximations of the Frechet derivative of the forward 
map in the selection of the parameter that has been proven to provide stability in the 
regularizing LM scheme of Hanke [H]. As stated earlier, for the present methodology, 
such a theory is beyond the scope of the present work. 


Iteration: 1 Error: 0.88 


Iteration: 2 Error: 0.91 


True log(K) 



Iteration: 3 Error: 0.90 


Iteration: 10 Error: 0.96 


Iteration: 15 Error: 0.98 



Iteration: 5 Error: 0.96 


Iteration: 7 Error: 0.84 


Iteration: 9 Error: 0.74 


Iteration: 11 Error: 0.65 


Iteration: 13 Error: 0.61 







'M 





Figure 1. Top: true log-conductivity. Middle row: ensemble mean at some iterations of the 
standard (unregularized) ensemble Kalman method. Bottom row: ensemble mean at some 
iterations of the regularizing ensemble Kalman method. 


3. Properties and computational aspects of the regularizing ensemble 
Kalman method 


In this subsection we discuss some properties and general computational aspects of the 


proposed regularizing ensemble Kalman method presented in subsection 2.2 


3.1. The initial ensemble 


One of the main properties of the proposed ensemble Kalman method is that the estimate 
Un obtained from Algorithm [T] lives in the subspace generated by the initial ensemble 


Proposition 3.1 (Invariance Subspace Property) At every iteration of the scheme, 


the ensemble mean Un defined by (f) satisfies Un G spanlu^'^Yj^^ 
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Figure 2. Log-data misfit (left) and relative error w.r.t truth (middle) of the ensemble mean 
at some iterations obtained with the unregularized ensemble Kalman method (solid red line) 
and with the proposed regularizing scheme (dotted blue line). Right: log(a„) as a function of 
the number of iterations of Algorithm from 40 experiments with different initial ensembles 
of size Nf, = 150. 


Proof: Once we rewrite Algorithm in the augmented version of subsection 3^, the 
proof follows directly with the same argument as [161 Theorem 2.1] □. 

From the invariance subspace property it follows that the selection of the initial 
ensemble is a design parameter crucial to the performance of the proposed scheme; we 
study this numerically in subsection 4.2 Prior knowledge of the space of admissible 


solutions X can be used for such selection. For example, can be a truncated 

basis for X. Another example of prior knowledge that we may use for the construction 
of the initial ensemble is the regularity of the elements in X. More concretely, we may 
construct an ensemble by drawing its members from some probability distribution with 
the desired regularity. While the problem under consideration is deterministic, the use of 
a probability distribution is made for the sake of the generation of the initial ensemble 
with the regularity of the parameter space. The aforementioned invariance subspace 
property then ensures that the estimate produced by the proposed method inherits 
the regularity of the space of admissible solutions. Clearly, Gaussian distributions are 
desirable since sampling from them is relatively easy. Nonetheless, other priors such 
as Besov [25] could also be considered. In subsections 4.2 we provide examples where 
probability distributions are considered in order to generate an initial ensemble that we 


use with our computational approach for the FIT problem described in subsection 2.3.2 


In general, the construction of the prior ensemble is based on prior knowledge of the 
problem under consideration. 


3.2. The Regularizing LM scheme 

In the following subsection we derive the proposed method as an approximation of the 
regularizing LM scheme na. For the subsequent derivation we consider X completed 
with the norm ■ ||x where C~^ : D{C~^) C X —X is a densely-dehned 

unbounded self-adjoint operator with compact resolvent and is dehned in terms of 
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the spectral decomposition oi C ^ (see m Section 2.1]). For the purpose of this work, 
C~^ is an operator selected a priori that enforces regularity on the functional space X. 
Introducing an operator C~^ in our formulation enable us to (i) derive the proposed 
method from the regularizing LM scheme of Hanke [H] and (ii) establish a connection 
between the classical (deterministic) and the Bayesian approach for inverse problems. 
In the Bayesian framework, C~^ is the inverse of the covariance operator C from a prior 
distribution [32] • It is important to remark that the dehnition of C~^ does not appear 
in the proposed scheme Algorithm 

The regularizing LM scheme of [H] is an iterative method that possess the 
regularizing properties need for the stable computation of solution to the identihcation 
problem described in the preceding section. In concrete Hanke in na proposes an 
iterative scheme where the Un+i iteration level is given by 

Un+i =Un + argmin ||r“^/^(|/^ - G{un) - DG{un)v\\Y + a„|(20) 

v£X 


where a > 0 is a regularization parameter chosen below and DG denotes the Frechet 
derivative of G. From the hrst order optimality conditions associated to the right hand 


side of (20) it follows that the previous expression is equivalent to 
«„+i = + (.DG'(u„} r-‘ BG(«„) + a„C-^)-^C DG'My” - G(u„}). 


( 21 ) 


The convergence and regularizing properties of the method of jM] requires that in 
(20) satishes 

P||F-V2(^17 _ GiunMY < an\\T-^^\y^ - G{Un) - DG{Un){Un+l - Un))||y (22) 


for p G (0,1) selected a priori and that the scheme is terminated whenever the nth 
iteration level satishes 


||F-V2(^i? - GiunMy <TV< ||F-V2(2/^ _ G'(n„_i))||y (23) 

for some r > 0 that satishes r > 1/p. 

The theory of Hanke provides assumptions on the forward operator under which 
the regularizing LM scheme terminates after a hnite number of iterations and the 
corresponding approximation is a solution of the inverse problem in the small noise 
limit as described in subsection 12.11 For full details of the theoretical framework for 
the regularizing LM scheme the reader is referred to the work of Hanke in na. An 
application of this method to inverse problems in the geosciences can be found in [20] . 

In order to derive the proposed ensemble method as an approximation of the LM 
scheme, we need the following lemma where we assume that Y = (recall that 
dim(y) < cx)) with the standard Euclidean inner product. 


Lemma 3.1 Assume that for any u E X, the linear functionals DmG{u) : X —)■ M 
(m = 1,... ,M/ are linearly independent. Then, expression (21) is eguivalent to 

Un+I = Un + G DG*{un){DG{un) G DG* {Un) + anT)~^{y - G{Un)) (24) 


Proof: See appendix. 
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Note that (24) can be used to rewrite (22) as follows 
p\\T-^/^{yV - G{ur,mY < an\\T^^^DGMCDG*M + anT)-\y^ - GM)\\y (25) 


Expressions (24)-(25) are computationally more convenient since the operator inversion 
of {DG{un) C DG*{un) + ttnE) is conducted on a hnite dimensional space. Upon 
discretization the aforementioned inversion has often negligible cost since the number 
of observations is typically small. Note that when X is hnite dimensional the relation 
between (21) and (24) follows simply from matrix Lemmas which cannot be applied in 
the present case. 


3.3. The proposed method as an approximation of the regularizing LM scheme 
Let us consider a hrst order approximation of G{un'^) around the ensemble mean, i.e. 

G{ul^^) ^ G{Ur,) + DG{Un){u^^^ - Un) 


Then, 


iVe 


Wn = —'^G{ul^^)^G{un) and - G(m„), w)y. 


j=i 


Dehne the following covariance operator 


Afe 




(26) 


(27) 


i=i 


From (26)-(27) and ([^ we hnd that 


We 


CTDG-(u„)v = - E„){(i,“ - u„),DG-{u..)v)x 


1 = 1 
We 


N, - 1 


— - E„){G(i,W) - G{u„),v) = GTv 


1 = 1 


and from similar arguments we obtain 

DG(u„)CrBG(H„)G.« Crv 


(28) 


(29) 


In expression (24) we now replace the following terms and the corresponding 
approximations 

(30) 

(31) 

r>G(5„)CrCG*(S„) M C"“ (32) 


Ufi y Vj 

GDG*{Un) : 
DG{Un)GDG*{Un) : 


GrDG*{un) ~ cr 


thereby obtaining 


S„+i = «„ + cr(cr + a„r)-'(9’> - iE!„) 


( 33 ) 
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that we use as the update formula for the ensemble mean of our iterative scheme. 


Similarly, from (26) - (32), the selection of a in (25) and the stopping criteria (23) 
become 

p\\T-^/\y^ - WnMY < an\\T^/\Cr + c^nT)-\y^ - w, 


Y- 


(34) 

and (§, respectively. The sequence dehned in ( p)o| satishes ( [M] ). In addition, note 
that the sample mean obtained from the ensemble updated according to formula (|^ 
satishes indeed ( [3^ . We then obtain the proposed ensemble Kalman scheme presented 
in Algorithm 


3.4- The proposed method as a seguence of linear inverse problems 

As state earlier, the proof of convergence of Algorithm [T] is beyond the scope of the 
present manuscript. Nonetheless, in this subsection we show some relevant properties 
which shed light on its regularizing effect. In order to study the aforementioned 
properties it is advantageous to rewrite the proposed algorithm in terms of the following 
augmented variables 


z = 




(36) 


the space Z = X x Y and the projection operators H and dehned hj Hz = w and 
H^z = u, respectively. It is not difficult to see that, in terms of these new variables. 
Algorithm [T] becomes 

Algorithm 2 

Let C X be the initial ensemble of Ne elements. Let p G (0,1) and r > 1/p. 

Define 


Aho) _ 

Zq — 


u 


U) 


G{u, 


Uh 


(36) 


For n = 1,... 


(1) Prediction step. Evaluate the forward map (35), 


JjJ) ^ 


and define sample mean: 


1 


Ne 


H = _L AJJ) 




i=i 


(2) Discrepancy principle. If 

\\T-^/^{y - Hzi)\\Y < TP 

stop. Output 


(37) 


(38) 


(39) 


1 


N. 


Ur, = - — - 






zz”-’' = 


Nr 


Ne 

Y 

i=i 


u 


U) 


(40) 
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Define sample covariance: 


Cr,.Z = 


Afe 


iVe-1 


^)z- 


i=i 


(3) Analysis step. Update each ensemble member as follows 
^Ua) ^ ^UJ) ^ CnH*{HCnH* + aV)-\y^ - 
where a satisfies 

p\\T-^l\f - Hzl„)\\y < - HlZ.{a))\\y 

with 


Ne 


2 “ = 




E 

i=i 




(41) 

(42) 

(43) 

(44) 


It is not difficult to see [I6] that Cn in (41) can be written as 

CUU 


a. = 


yuw 

' n '~^n 

^wu /^ww 


While the regularization properties of the proposed method will be studied 
numerically in Section]^ it is clear that the essence of the proposed regularizing Kalman 


method is the selection of a by means of (43). Since 0 < p < 1, the existence of such a 
is ensured by the following proposition 

Proposition 3.2 At every iteration of the scheme, the ensemble mean ^“(o) defined by 
satisfies 

(%) ||r-i/2^|/ - Hzf{a))\\Y -)■ ||r"i/2(|/ - HzDWy as a ^ 00. 

(a) \\T~^/‘^{y — Hzf{a))\\Y — )■ 0 as a — )■ 0. 

(Hi) The map a —>■ — Hz‘f{a))\\Y is monotonously nondecreasing 

Proof: The proof can be carried out as in [23l Theorem 2.16] □. 

Similar to the regularizing LM scheme where each iterate solves a linear inverse 


problem (see expression (20)), our ensemble Kalman scheme can be posed as Tikhonov- 


type regularization. This equivalence motivates the need to further regularize standard 
Kalman methods. 


Note that the ensemble mean of the analysis step (42) is given by 


+ C„H’(HCnH’ + ary^y” - Hzl) 


(45) 


We now show that (45) can be posed as the solution of a regularized linear inverse 
problem 

Proposition 3.3 Assume that at each iteration, the ensemble {zn’^'^}fli is linearly 


independent. Then, the ensemble mean of the analysis step (45) of the ensemble Kalman 
algorithm satisfies 

zf = argmzn^^2r.(^\\^-'^y^ - Hzi- H{z -zi))\\l + a\\z -ziWl^^ (46) 

where is the completion oflZ{Cn) with respect to the norm induced by {■,C~^-)z and 
denoted by || ■ Hc^. 
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Proof: From the definition of Cn in (41) it is clear that the rank of Cn is given by 
TZ{Cn) = span{zn’^}- Also note that Cn is a sum of rank-one operators, hence compact. 
If Ne < oo, then dim(7?.(Cn)) = Ne < oo; thus from the linearly independence of the 
initial ensemble it follows that the restriction Cn '■ TZ{Cn) —t TZ{Cn) is a positive definite 
operator. Hence its inverse C~^ : lZ{Cn) lZ{Cn) exists. Moreover, we may consider 
the restriction of H into Tl{Cn) as well as its adjoint H* : Tl(Cn) —t Y. Thus, the 
operators Cn, C~^ and H can be represented with matrices and so standard matrix 
algebra can be directly applied to show that (20) is equivalent to [33] 

(H'T-'H + aCp){Si - zD) = H'T-'iv - Hz’,) (47) 

which, from simple standard arguments is the solution to the normal equations 
associated to the minimization of 


J{z) = ||r ^ 2 {y-Hzl^-H{z-zi))\\'^ + a\\z 


- 

^n \ \C, 


■7-1 

’n 


(48) 

: 'K(Cn) K(Cn) as a 


in the space Zn- For the case A"e —)■ cx) we may consider C~ 
densely defined unbounded operator. Then a proof based on representers can then be 
carried out similarly to the proof of Lemma 3.1 □. 


From Proposition 3^ we notice that the ensemble mean at each iteration of the 
ensemble Kalman algorithm is the Tikhonov-regularized solution (in the subspace Zn 
with norm || ■ ||c„) of the following “artificial” linear inverse problem: 

given y = y'^ — Hz^ find w = z — z{^ E Zn such that y = Hw (49) 

Let us define z'^ = {ufiG{u^))'^ and tyl = z'^—z{^ where u\ we recall, is the truth. If at the 
nth iteration we were to solve the original identification problem (i.e find/approximate 
«!) that would imply to find tyl defined above. In other words, tyl is the solution to the 
inverse problem (49) and so the corresponding exact data is y^ = Hw^ which is 

= Hw^ = H{z^ - zi) = G(nt) - Hzi = y^ - ^ - Hz^ = y - ^ (50) 

where we have used ([^ and the definition of H. From the previous expression and (|^ 
it follows that 


|r-'''"(»'-») = ||r-‘''=5||v = ., 


(51) 


which implies that the noise level for the linearized inverse problem solved by is the 
same as the noise level of the original nonlinear inverse problem that we aim at solving 
by means of the proposed method. Moreover, note that before convergence is achieved, 
i.e. when ry < ||F“^/^(j/^ — Hz{fi\\Y from (43) we then have that 


V <- HH)\\y < p||r -‘/=(9 - Hz’„)\\y <- HTM)\\y- (52) 

Thus, at each iteration, the selection of a in ( [4^ honors the discrepancy principle for 
the artificial inverse problem that the ensemble mean solves at each iteration of the 
scheme. 


Remark 3.1 From the augmented version of the proposed ensemble Kalman method we 
may appreciate more clearly the role of the parameter p. Indeed, note from Proposition 
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3.2 that a value of p 'zz 1 (p < 1) in (43) will results in a very large a and so the 
estimate 7'“(q!) will move slightly from the previous estimate zl{a). Then, a selection 
of T ^ 1/p, (t > 1/p) for p ^ 1 will enable us to allow the algorithm to progress until 
the data misfit is close to the noise level. In contrast, a smaller p results in a smaller a 
and thus a less controlled estimate z/[(a). Therefore, choosing r ~ 1/p, (r > 1/p) for 


smaller p implies that we need to stop the algorithm via (39) much earlier to prevent 
the data overfitting as previously discussed. While it seems clear that a choice o/ p ~ 1 
will result in a more stabilized scheme, it may be also detrimental to the performance of 
the scheme. In Section we investigate, with numerical experiments, practical choices 
for the parameter p. 


3.5. The connection with the Bayesian framework 

While here we propose an ensemble Kalman method as a derivative-free regularizing 
method for classical inverse problems, most Kalman-based approaches are posed in a 
Bayesian inference framework where the objective is to produce an ensemble from which 
statistical information from the Bayesian posterior can be computed. In order to fully 
understand the underlying motivation of our method it is instructive to consider the role 
of the proposed method in the context of Bayesian inversion. Let us then assume that 
the noise in ([^ centered Gaussian noise with covariance L. Additionally, assume that 
there is an underlying Gaussian prior distribution N{u,C) of the unknown u, where the 
mean and covariance of such distribution are u and C respectively. Ghoose an initial 
ensemble generated from samples of this distribution, i.e. ~ N{u,C). Let us now 
consider the first iteration n = 0 of Algorithm [T] with a hxed parameter a = 1 and 
replace in (j^ by y^I^ = y"^ + where ~ A^(0,r). Then, Algorithmbecomes 


u 


U) 


= u, 


U) 


•0 +cr(c„”” + r)-‘(!,«-G('C)) (63) 

which is also the standard formula for the so-called Ensemble Smoother |T3]. If the 
forward operator G is linear then it can be shown ng that the updated ensemble 
fully characterizes, as Nf, —)■ oo, the conditional probability of u given the data 


y'^. In other words, (53) provides an ensemble approximation of the Bayesian posterior. 


Moreover, the mean of the ensemble converges, in the limit ^ oo, to the maximum a 
posterior estimate, i.e. the maximizer of the aforementioned posterior distribution. The 
linear-Gaussian case is trivial in the sense that the posterior is Gaussian and it can be 
easily characterized with its mean and covariance. However, when the forward operator 
G is nonlinear, such as the ones described in subsection |2.3 the Bayesian posterior is 
in general non-Gaussian and the ensemble smoother provides only an approximation 
whose convergence theory, to our best knowledge, is nonexistent. It has been often 


reported, however, than when straight forward application of (53) are used in the 


general nonlinear case, inaccurate approximations of the Bayesian posterior may be 
obtained izn. In a recent publication [19], we have demonstrated that the application 
of iterative regularization methods like the ones proposed here may improve the accuracy 
of ensemble methods for approximation of the statistical inverse problem. The present 
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work follows the classical approach; we endow an intrinsically Bayesian method with the 
regularizing properties needed to address deterministic inverse problem in a derivative- 
free easy-to-implement computational framework. 

3.6. Small ensemble size ill-conditioning 

As stated above, issues of stability of ensemble Kalman methods for data assimilation 
have been often reported. Ad-hoc fixes such as covariance localization and covariance 
inflation have become standard practice to address such lack of stability which have 
been typically associated and more noted with implementations of small ensemble size 
with respect to the number of measurements [1]. Indeed, note that since dim(K) = 
M < oo,then dim(7?.(C^"')) = mm{Ne,M}. Clearly, when < M, is singular 
and, even though (C^'^ -|- is strictly positive dehnite, it could potentially have a 
large conditioning number for some choices of F, which is in turn selected according 
to measurement information. This type ill-conditioning when Ng < M has the similar 
effect of the ill-posedness of the Frechet derivative of the forward map in the regularizing 
LM scheme. Thus, it comes as no surprise that the application of ideas from iterative 
regularization techniques can be useful for addressing the ill-conditioning of the method 
when used with a small ensemble size with respect to the number of measurements. 
From either sources of ill-conditioning, it is clear that ensemble Kalman methods need 
regularization and, to our best knowledge, the standard fixes indicated above do not 
possess a mathematical framework which provides general guidelines for developing 
robust implementations. In contrast, iterative regularization has a rigorous general 
mathematical ground; our goal here is to use it as the tool to empower ensemble Kalman 
methods with the stabilization/regularization required to applied them as accurate 
robust but computationally tractable solvers for large-scale PDE-constrained inverse 
problems. 


3. 7. Computational aspects of the scheme 


Under our assumption that the number of measurements is small compared to the 
dimension of the (discretized) parameters space, the main computational cost of the 
proposed scheme is in the evaluation of the forward map in the prediction step ([^. The 
construction of as well as the inversion of -t-ctriF)”^ are negligible for the 

number of measurement considered for the present applications.Therefore, the cost of 
Algorithmic is mainly n*Ng forward model evaluations where n* is the stopping iteration 
determined by (43). It is clear that Algorithm [C is computationally feasible provided 
that, with a few Ng ensemble members, a reasonably accurate estimate can be achieved 
in only a few iterations. In Section |C we provide an extensive numerical investigation of 
the performance and computational feasibility of the scheme. In some cases, a relatively 
large number of ensembles are required to achieve stable computations with reasonable 
accuracy. 
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By means of numerical experiments, in this section we investigate the convergence 
and regularizing properties of the iterative regularizing ensemble Kalman method. 


We consider the identihcation problems on the test models from subsection 2.3 and 


conduct synthetic experiments where we apply the proposed scheme with synthetic 
data generated with a true parameter. Algorithm then produces an estimate (the 
ensemble mean (|^) which we compare against the truth in order to assess the accuracy 
of the scheme. By conducting synthetic experiments, our aim is to study the accuracy, 
convergence and regularizing properties of the proposed methods with respect to 

• ensemble size W 

• tunable parameters p and r 

• number of measurements M 

• noise level r] 

• selection of initial ensemble 


4-1. Test Model I. Darcy flow 

We now describe the generation of synthetic data from the Darcy flow model of 
subsection 2.3. 1[ We consider the identihcation of the log of the “true” hydraulic 
conductivity of Figure]^ (left). The true conductivity is a random held drawn from 
a Gaussian measure A^(0,C') with covariance C : L‘^{D) —)■ L^{D) dehned by 


C(p{x) = / c{x,y)(p{y)dy 
Jd 


(54) 


with a spherical covariance function 
c{x,y) = 


Co 


1 _ 3ft I 
^ 2 a “C 2 a3 


(55) 


if h = \x — y\ < a 
0 if h > a 

where a > 0 is the correlation length and Cq > 0 is the variance of the held. Gaussian 
distributions with covariance operators of this type are often used to model the geologic 
properties of some formations [9]. The true log conductivity held is used in equations 
(inii-(lil to hnd the true head held displayed in Figure (middle). The latter is then 
used in (14) where the measurement locations {xm}m=i be specihed below. Then, 


Gaussian noise is added to the synthetic data; the norm of the noise is 1% of the 
norm of the data. The Darcy how model is solved numerically with cell-centered hnite 
diherences |29]. The matrix F is simply a diagonal matrix with entries proportional to 
the size of the corresponding entry of G(m^). In order to avoid inverse crimes. Algorithm 
is applied on a coarser grid (of 80 x 80 cells) than the one (of 160 x 160 cells) used for 
the generation of the synthetic data. 

For the experiments of this section, we apply Algorithm with an initial ensemble 
that consists of N^. (to be dehned below) random helds drawn from the same Gaussian 
measure A^(0, C) that we use to generate the truth. Some members of the initial 
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ensemble are displayed in Figure The generation of these fields (as well as the 
truth) is conducted by means of the Karhunen-Loeve (KL) expansion of a random 
field distributed according to A^(0,C'). Although the truth and the prior ensemble are 
samples from the same probability distribution, note that (i) the truth is generated on 
a much finer grid and (ii) the truth (or its projection on the coarser grid) is not part of 
the initial ensemble. 

The motivation for selecting the truth as a realization from the same distribution 
from the initial ensemble is twofold. On the one hand, in subsurface applications 
there is often prior knowledge of the geologic properties in terms of a prior probability 
distribution for the truth. It is then natural, as in the Bayesian approach, to use this 
distribution to generate the initial ensemble. On the other hand, by choosing the truth 
and the initial ensemble form the same probability distribution we expect to reduce 
(although not completely eliminate) the effect that the selection of an initial ensemble 
has on the performance of the proposed scheme. More precisely, from the invariance 
subspace property (Proposition 3.1) we know that the approximation provided by our 
method lives in the subspace generated by the initial ensemble. Therefore, each initial 
ensemble will produce a different estimate. However, by generating the initial ensemble 
from the same distribution that we use to generate the truth, our initial ensembles, while 
different from the truth, have the same regularity and spatial structure (e.g. correlation) 
of the true field. In the following subsection we study an inverse problem where the 
truth is a prescribed field with no association whatsoever to the initial ensemble and 
there we study the effect that the spatial correlation of the initial ensemble has on the 
accuracy of the proposed ensemble method to identify the truth. 

While the initial ensemble generated as described above will provide estimates 
with the same spatial features of the truth, from the aforementioned subspace property 
we certainly expect that the results form Algorithm will vary with the selection 
of the ensemble. Therefore, in order to fully assess the numerical properties of this 
algorithm without the influence of the initial ensemble, for the present work we will 
conduct multiple experiments corresponding to different choices of the initial ensemble 
and provide averages over these experiments and thus reliable quantities related to the 
average performance of the scheme. It is worth reiterating that the initial ensemble is a 
design parameter that can be chosen in various ways according to available information. 


4 . 1 . 1 . Effect of the ensemble size N^. We generate synthetic data as described below 
with the array of 100 measurement locations displayed in Figure (right). We use 
these data in Algorithm with a (fixed) parameter p = 0.7. In order to appreciate the 
effect of the early termination of the scheme, the algorithm is allowed to progress even 
when the data misfit goes below p/p (recall the stopping criteria (|^ requires r > 1/p). 
In the top (resp. bottom) of Figure we plot the relative error with respect to the 
truth \\un — |l2(d)/||l2(d) (resp. log - data misfit) from 40 different experiments 

corresponding to different selection of the initial ensemble (recall each initial ensemble 
is a set of draws from a the Gaussian distribution defined above). Different panels 
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Truelog(K) True h Measurement locations 



Figure 3. Left: true log-conductivity. Middle: log-data misfit. Right: relative error with 
respect to the truth 



Figure 4. Some elements from the initial ensemble. 


corresponds to different selections of ensemble size Ng. Each of the blue curves (resp. 
red curves) represents the log-data misht (resp. error with respect to the truth) that 
we obtain from the ensemble mean Un (expression (|^) computed with Algorithm 
initialized with each of the 40 initial ensembles mentioned earlier. We reiterate that by 
“data misht” we mean the misht between data and the average of the model outputs 
from the ensemble dehned in (|^ which is, in turn, the quantity monitored for the 
convergence of the scheme. 

The green curve in the bottom (resp. top) of Figure represents the log-data 
misht (resp. error w.r.t truth) averaged, at each iteration, over the 40 experiments 
from diherent initial ensembles. The dotted vertical line dehnes the iteration number 
after which the averaged relative error w.r.t. the truth starts increasing. The dotted 
horizontal line in Figure]^ (bottom) indicates the log of the value of rj/p. The solid 
red line indicates the value of the noise level rj. As we expected, these results conhrm 
that Algorithm reduces the relative error w.r.t. truth. However, after some number 
of iterations, this error will start increasing unless the algorithm is stopped. The results 
from Figure reveals that there is a critical ensemble size for which, on average (over 
several experiments with different initial ensembles), the discrepancy principle (|^, with 
T ^ 1/p is a. reliable stopping criteria which terminates the algorithm before the error 
w.r.t. the truth increases due to data overhtting. In other words, the stability that the 
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scheme inherits from the regularizing LM scheme depends on the ensemble size. For this 
experiment the critical ensemble size is Nf. = 125. In average, for N^. < 125 the error 
with respect to the truth of the ensemble mean will increase before the data misfit has 
dropped below rj/p. For Ne > 125 we observe that the error will increase when the data 
misfit has reached the value in r^/p as we would expect if the regularization properties 
were inherited from the regularizing LM scheme. For larger ensemble sizes Ne > 200 
the relative error with respect to the truth will not increase even when the data misht 
takes values below p/p but a slow increase starts showing when the data misfit drops 
below the noise level. 

We also note from Figure that, as we increase the ensemble size, the ensemble 
mean Un provides a more accurate approximation of the truth (provided the scheme is 
properly stopped). However, a further increase in the ensemble size results in higher 
computational cost of the scheme. In Figure we show the ensemble mean that we 
obtain, when Algorithm is initialized with different ensemble sizes Ne and the scheme 
stopped according to ([^ with r = 1/p. For this result, the members of the smaller 
initial ensembles are contained in the larger ones. We can clearly appreciate that for 
smaller ensemble sizes, the stabilization of the scheme may be lost and thus its accuracy. 






Figure 5. Relative error w.r.t. the truth (top) and log - data misfit (bottom) from 40 different 
experiments with p = 0.7 associated to different initial ensembles of size (from left to right) 
Ne = 75,100,125,200,400 


4-1.2. Effect of the number of measurements. We now investigate the relation between 
the number of measurements and the critical size that enable us to observe the 
regularization properties inherited by the regularizing LM scheme. As discussed in 
subsection |2.4| stability issues for Ne < M may arise unless the parameter controls 
the inversion in ([^. We recall that the selection of according to (34) is inspired by 
the regularizing LM scheme which addresses a different type of ill-posedness (although 
is reflected in a similar fashion). Thus, this selection of may not necessarily address 
the stability issues that arise in the ensemble method when Ne < M. In the following 
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Iteration: 15 Error: 0.91 Iteration: 15 Error: 0.70 Iteration: 15 Error: 0.63 Iteration: 13 Error: 0.58 



True log(K) 



Figure 6. Log - conductivity estimates obtained from an initial ensemble size of (from left 
to right) 75, 100, 150, 200 (the larger ensembles contains the smaller ones). These estimates 
are the ensemble mean obtained from Algorithm with r = 1/p in Q. 


experiments we show that under some conditions, the selection of and the stopping 
criteria does indeed resolve the stability issues even when Ng < M. 

We conduct a set of experiments, each of them similar to the one described in 
the preceding subsection but with synthetic data generated from different measurement 
conhgurations. For each measurement conhguration and ensemble size, we conduct a 
set of 40 experiments corresponding to different initial ensembles generated as described 
earlier. Then, log-data misht and relative error with respect to the truth were computed 
and averaged (over the 40 experiments) at each iteration of the scheme. For clarity we 
only report these averaged quantities corresponding to each ensemble size and each 
measurement conhgurations. In the left column of Figure and Figure we display 
the measurement conhguration for each experiment. The middle and right columns of 
Figures and show the (averaged over diherent experiments) log data misht and error 
w.r.t truth for four diherent ensemble sizes (increasing from top to bottom in the legend 
of these Figures). The dotted horizontal line in the middle columns of Figure and 
Figure denotes the value of the data misht corresponding to rj/p. The vertical line 
in the middle and right columns indicate the iteration after which the error w.r.t truth 
increases for the second choice (from to top to bottom) of ensemble size displayed on the 
legend of these hgures. It is important to note that, irrespective of the ensemble size, the 
proposed algorithm does stabilize the estimates obtained in the early iterations. In other 
words, it avoids uncontrolled estimates that some standard unregularized Kalman-based 
methods may display in the hrst couple of iterations. However, the relation between 
ensemble size and number of measurements has severe ehects on the ability of the 
stopping criteria ([^ to successfully terminate the algorithm under the guidelines that 
arise from the application of iterative regularization (i.e. with r cs 1/p). Let us, for 
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example, examine the case M < 100 reported in Figure For smaller measurement 
locations (i.e. M = 25,36), an ensemble size equal to the number of measurements 
Nf. = M, on average, will provide an estimate whose error with respect to the truth will 
decrease in the hrst iterations but then starts increasing long before the data-misht has 
reached the value rj/p. A similar but less drastic behavior is observed for larger number 
of measurements {M = 49,64). However, as we increase M, we hnd that a selection 
Nf. = M yields results that seemed to be stabilized whenever the scheme is stopped 
according to (|^ with r = 1/p. Note that for these measurement conhgurations with 
M < 100, the ensemble size needs to be larger than the number of measurements so that 
the computational solution is properly stabilized with the proposed stopping criteria. 

Let us consider now the case M > 100 reported in Figure In contrast to the 
previous case where an ensemble of a size larger than that number of measurements is 
needed for the stabilization of the computations, here certain choices of iVe with Ne < M 
may suffice. Note for example that for M = 225, an ensemble with = 169 will result 
in stable computations by using ([^ with r 1/p. Interestingly, for larger M it is 
more evident that the selection iVe > M is unnecessary for the algorithm to exhibit 
the regularization properties based on the stopping criteria mentioned above. Indeed, 
for M = 400 and N = 484 we note that Ne = 196 and Ne = 225 will clearly provide 
regularized estimates in the sense that their error will not increase before the data misht 
reaches the value p/p. For these cases with M > 100, a selection of iVe > M will clearly 
result in not only stable but an accurate identihcation. However, it is worth reiterating 
that for large-scale applications, the smallest ensemble size that produce a regularized 
solution is desirable due to the high computational cost of the forward simulations. In 
summary, there seems to be an apparent critical number of measurements (M = 100) 
below of which a selection oiNe> M is needed for the proposed method to fully stabilize 
the scheme. However, for larger number of measurements the regularizing properties 
of the scheme enable us to use a selection oi Ne < M which results in a reasonable 
computational cost. 


4.1.3. The tunable parameters p. As we discussed in subsection |3.4[ the parameter p 
controls the ensemble updates. The choice of this parameter has a signihcant effect on 
both accuracy and cost of the proposed scheme. We now investigate this effect with 
our Darcy flow model with the measurement conhguration of 100 locations displayed 
in Figure]^ (right). The results from the preceding section indicate that an ensemble 
of size Ne > 100 will provide the stabilization needed with the choice r ~ 1/p in 
the stopping criteria. We select Ne = 150 for conducting a set of experiments where 
Algorithm [T] is applied with several choices of p. For each choices of p, the experiment 
is repeated 40 times with a different selection of the initial ensemble generated as we 
previously described. In Figure we display, for these 40 experiments, the log-data 
misht (bottom) and the relative error with respect to the truth (top) of the ensemble 
mean obtained with the proposed scheme with different choices of p. The horizontal 
dotted line in Figure (bottom) represent the log of p/p. Note that the stopping 
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Figure 7. Numerical results with the proposed method for p = 0.7 and different measurement 
configurations and ensemble sizes. Left column: measurement configuration. Middle column: 
relative error w.r.t. the truth. Right column: Log - data misfit. Quantities are averaged at 
each iteration, over 40 experiments corresponding to different experiment. 
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Figure 8. Numerical results with the proposed method for p = 0.7 and different measurement 
configurations and ensemble sizes. Left column: measurement configuration. Middle column: 
relative error w.r.t. the truth. Right column: Log - data misfit. Quantities are averaged at 
each iteration, over 40 experiments corresponding to different experiment. 
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criteria 0 with T zz Ijp provides a reasonable stabilization of algorithm. For p < 0.7 
we observe that the error with respect to the truth increases quite rapidly when the data 
misht drops below the value rj/p. We note that, on average, there is a slight decrease 
of the error with respect to the truth as we increase p. However, this slight increase in 
the accuracy of the identihcation has associated an increase in the computational cost. 
For the Darcy flow model, our experiments suggest that p = 0.7 represents a reasonable 
choice in terms of accuracy and cost. 



Figure 9. Regularizing ensemble method applied with = 150, 100 measurements and p 
(from left to right) p = 0.5,0.6,0.7,0.8,0.85. Top row: relative error w.r.t. truth. Bottom 
row: Log - data misfit. 


4-1.4- Regularizing properties of the proposed method in the small-noise limit In this 
section we conduct experiments to study the regularization properties of the proposed 
scheme in the small noise limit. The previous experiments show that with a reasonable 
selection of the ensemble size with respect to the measurement conhgurations, the 
mean of the proposed ensemble method achieve stable approximations of the truth. 
In other words, the early termination of the scheme, say at iteration n*, yields an 
approximation = u^* given by the mean of the ensemble at the stopped iteration. As 
discussed earlier, the aim of our iterative regularizing algorithm is to provide a stable 
approximation in the sense that, as p —)■ 0, —)■ m where G{u) = G{v)). Due to the lack 
of uniqueness of the identihcation problem we cannot claim that u = (the estimate 
converges to the truth). Nevertheless, we expect that the elements u E X such that 
G{u) = G{u^) will possess similar spacial features of the truth. 

We consider again the conhguration with M = 100 as before and we vary the 
ensemble size. In Figure 10 we show the relative error with respect to the truth (top) and 
the log-data misht (bottom) obtained for each ensemble size with a diherent selection 
of noise levels. The level of noise is selected so that the norm of the noise relative to the 
data is of the percentage indicated in the plots. The matrix F as selected as described 
earlier but kept hxed for these experiments associated to diherent noise levels. For this 
particular experiment the algorithm is stopped according to 0 with r = 1/p. We note 
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that for the smallest noise considered here 0.5% an ensemble of size Ne = 150 was needed 
to properly stabilize the compntations. Again, this shows that the proposed method, 
for snfficient large ensemble, inherits the regnlarization properties of the regnlarization 
LM scheme of [H]. 



Figure 10. Log - data misfit (bottom) and relative error w.r.t. the truth (top) from 40 
experiments with different initial ensembles of size of (from left to right) 75, 100, 150, 200 and 
different choices of noise level. 


4-1.5. Comparison with the regularizing LM scheme. In Fignre 11 we display the 
numerical results from the application of the proposed scheme to identify the log- 
conductivity described in the preceding subsections. As before, we used 40 different 
experiments corresponding to different initial ensembles of size = 150. In this 
case, however, we compare with the regularizing LM scheme applied to the solution 
of the parameter identihcation but constrained to the subspace generated by the 
initial ensemble. The numerical results displayed in Figure [T^ suggests that the 
proposed ensemble Kalman method produces a derivative-free approximation of the 
regularizing LM scheme where the Frechet derivative is approximated by the ensemble 
covariance. Moreover, these results show that the proposed ensemble Kalman algorithm 
is minimizing the data misht is a stable fashion. 


4-2. Test Model II: BIT 

In this section we investigate the performance of the regularizing ensemble Kalman 


method for the solution of the FIT problem described in subsection 2.3.2 Similar to the 
previous subsection, our aim is to understand fundamental aspects of the regularizing 
properties of the scheme. In particular, in this subsection our aim is to observe the 
effect of the selection of the initial ensemble on the accuracy and cost of the proposed 
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Figure 11. Log-data misfit (left) and rel. error w.r.t truth (right) from estimates obtained 
with the regularizing LM scheme and the proposed ensemble method (Results from 50 
experiments with different initial ensembles of size Ne = 150). 


technique. In contrast to the previous subsection where the true unknown held was a 
random draw from a distribution that we used as well for the generation of the initial 
ensemble, here we prescribe a conductivity k that consist of three cicular inclusions on a 
circular domain D of unit radius similar to the one used in [28] . This true conductivity 
is displayed in Figure along with the conhguration of 16 electrodes used in the CEM 
described in subsection |2. 3. 2[ We consider 15 current patterns where current is injected 
between a pair of adjacent electrodes. For each current pattern we collect measurements 
on all the electrodes thereby having M = 240 observations. We choose a value Zk = 0.01 
for the contact impedances of all the electrodes. 


In the middle and middle-right of Figure we show some of the true voltages 
obtained with the FEM solver from EIDORS MATLAB framework |2|. Synthetic data 
were generated from the aforementioned voltages by adding Gaussian random noise of 
standard deviation of 2% of the signal. Inverse crimes are avoided by using a hner mesh 
(with 7744 elements) than the one used for the application of Algorithm (mesh of 
6400 elements). The experiments in this subsection are focused on the electrode and 
measurement conhguration described earlier. 

Note that the experiments from this subsection comprise the more general case 
where there is no characterization of the truth in terms of a probability distribution. In 
this case, we still consider an initial ensemble generated from samples from a Gaussian 
measure A^(0,C'). However, this probability distribution is completely artihcial and 
dehned only for the purpose of generating an initial ensemble. In other words, we 
assume no link between the truth and our choice of initial ensemble. We consider the 
covariance operator that arises from the Wittle-Matter correlation function [28|. In 2D, 
such covariance operator takes the form 


C = ujL\1- L^A) 


-e 


(56) 
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Figure 12. Left: true log-conductivity. Middle and Right: Two of the 15 voltage simulations 
patterns computed with the true conductivity. 


where A is the Laplacian operator, L is a correlation length. For the purpose of this 
work we regard 6 as & parameter that controls the regularity of the samples and a; as a 
scaling constant. Some samples from such distribution with parameter L = 0.2, 6 = 5 
and u = 0.1 are displayed in Figure 13 As before, samples from N{0,C) are obtained 
by means of the KL expansion of random fields N{0,C). Samples are generated on a 
regular rectangular domain that contains the computational domain shown in Figure [T^ 
and projected on the FEM domain used for the computation of the CEM. 





Figure 13. Draws from a Gaussian prior of the form (56) with L 


0.2 and 6 = 5. 


4-2.1. Ensemble size Ne, tunable parameter p and small noise limit. In subsection 4.1 


we have extensively studied the role of ensemble size and tunable parameters. The aim 
here is to briefly verify that similar results are obtained when Algorithm [T] is applied 
to the EIT problem. For a hxed parameter p = 0.5 we conduct a set of experiments 
where the scheme is applied to 40 different initial ensembles generated from the Gaussian 
measure described above. The corresponding log-data misfit (bottom) and relative error 
(top) from these experiments are displayed in Figure [14} As with the previous test model, 
the ensemble size is crucial to the performance of the proposed iterative scheme. We 
observe a critical ensemble size of N^, = 75 above which the proposed scheme is properly 
stabilized when the early termination is carried out according to (j^ with the selection 
oi T zs 1/p provided by the theory of the regularizing LM scheme. In this case we 
note again that the critical size above which the method exhibits regularizing properties 
was smaller than the number of measurements; this conhrms that the proposed scheme 
addresses the small ensemble effect described in subsection 12.41 The ensemble mean 
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from 5 experiments (from different initial ensembles) with iVe = 100 are displayed in 


Figure 15 


In Figure 1^ we fix = 100 and now perform a set of experiments with different 
selections of p in (34). As before, we confirm that for this sufficiently large ensemble size, 
the selection of r > 1/p in the termination of the scheme ensures stabilized estimates. 
It is worth noticing that the aforementioned selection of r is essential as the relative 
error increases abruptly once the data misfit drops below the value p/p. While there 
is a slight increase in the accuracy (in terms of the error w.r.t truth) as we increase 
p, we clearly observe that the number of iterations and so the computational cost of 
the scheme increases substantially. For this case we observe that p = 0.5 provides a 
reasonable balance between computational cost and accuracy. 

Finally, in Figurej^we show the effect of the stabilization of the scheme as the noise 
decreases. These results corresponds to the application of the scheme with synthetic 
data generated from the electrode configuration described before but with different noise 
levels. The results from Figure Ff corresponds to averages, at each iteration, from 40 
experiments obtained from 40 different selections of initial ensembles. As we expected, 
the proposed ensemble scheme inherits the regularizing properties of the LM method 
and produces stable computations that converge, in a stable fashion as p —)■ 0. 



rtera&on 


Iteration 


Iteration 


Iteration 


iteration 







Figure 14. Relative error w.r.t. truth (top) and log - data misfit (bottom) obtained from 
Algorithm with (from left to right) = 50, 60, 75,150, 250. These plots display the results 
from 40 experiments with different initial ensembles. 


4-2.2. The selection of the initial ensemble As we have indicated earlier, the proposed 
scheme is highly dependent on the selection of the initial ensemble. In the experiments 
from the previous subsection, we attempt to reduce such dependence by selecting the 
initial ensemble from the same distribution that we use to generate the truth. This 
selection on the initial ensemble, from Proposition 3T ensures that the estimate from 
Algorithm has the same regularity/spatial structure of the truth. We wish now to 
investigate the effect of the selection of the initial ensemble on the estimates produced 
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Figure 15. Top: True log-conductivity. Middle and bottom: Estimates of log-conductivity 
obtained from 5 experiments with different initial ensembles. The proposed method is used 
with p = 0.5 and Nf. = 100. 




Figure 16. Error w.r.t. truth (top) and log - data misfit (bottom) obtained with Algorithm 
with (from left to right) p = 0.3,0.5,0.7,0.8 from 40 experiments with different initial 
ensembles of size = 100 


by our method. We therefore require a systematic way to generate substantially different 
initial ensembles. This can be achieved, for example, by modifying the parameters L and 
0 in (56) that, in turn, control the spatial correlation and regularity of the samples that 
we use as members of such initial ensemble. For simplicity we focus only on the selection 
of L and consider initial ensembles with the covariance (56) for different choices of L. 
In Figure 18 we present some samples generated with several choices of L (from left to 
right: L = 1,0.2,0.1,0.06,0.05). Each row corresponds to a fixed set of realizations of 
coefficients in the KL expansion that we use to generate those Gaussian fields. We can 
observe visually how the spatial correlation of the samples decreases with L. 

We apply Algorithm [T] for the solution of the EIT problem with an ensemble of 
size Nf. = 100 generated from the Gaussian measure described earlier with different 
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Figure 17. Log data misfit (right) and error w.r.t truth (left) from the average over 40 
experiments where Algorithm was applied with different initial ensembles to synthetic data 
of noise levels of 1%, 5%, 7.5%, 10%15%. 






X\ 




f 


' I: 


Figure 18. Draws from a Gaussian prior of the form (56) with (from left to right) 
L = 1,0.2,0.1, 0.06, 0.05. Each row corresponds to a different realization of the KL coefficients 
in the parameterization of the Gaussian field. 


choices of parameter L. For each value of L, we report the results from 40 experiments 
corresponding to different initial ensembles (i.e. different draws from the Gaussian 
measure). Figure 19 displays the resulting log-data misfit (bottom) and error w.r.t 
truth (top) . The selection of the initial ensemble based on L has a signihcant effect 
on the performance of the method. We note that the accuracy degrades as we decrease 
the correlation length. For the largest correlation length considered here [L = 1) we 
note that after a few iterations, before the data misfit reaches the value r^/p, the relative 
error shows a slight increase which results in large fluctuations of the data misfit. As 
we decrease the correlation length the relative error is stabilized and the data misfit 
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reduced. However, we note that the accuracy decreases as we use initial ensembles 
generated with members that have smaller correlation lengths (L < 0.1). Indeed, for 
L = 0.05, the relative error with respect to the truth increases before the data misfit 
drops below rj/p,. It is thus clear that the selection of the initial ensemble (in this case 
based on L) affects the regularizing properties of the scheme. 

In Figure 20 we display estimates (i.e. ensemble mean of the log-conductivity 
obtained from one of the the previous experiments for the different choices of L. More 
concretely, these are the estimates obtained with different initial ensemble corresponding 
to different choices of L but generated with the same realization of KL coefficients. 
From the invariance subspace property of the proposed scheme, it comes as no surprise 
that an initial ensemble of elements that have very small spatial correlation cannot 
possibly produce an estimate that characterizes the truth. Indeed, the inclusions of high 
conductivity from the truth have a diameter of approximately 0.3 which can be better 
identified when the initial ensemble is generated from smooth fields with a similar spatial 
correlation. Similar dependence on correlation lengths were obtained in the Bayesian 
level-set approach of HZI. While prior knowledge of the truth can be certainly used for 
the construction of the initial ensemble, further research should address the potential 
identification of parameters that determine the regularity of the initial ensemble that 
better characterize the truth. 



Figure 19. Error with respect to the truth (top) and log data misfit (bottom) from 40 
experiments with different initial ensembles generated from samples of A^(0, C) with values of 
L in (56) (from left to right) L = 0.2,0.1,0.06,0.04,0.03. 


5. Applications to geometric inverse problems 

The numerical investigations of the preceding sections establishes that the proposed 
ensemble Kalman method, with a sufficiently large ensemble size, inherits the 
regularizing properties of the regularizing LM approach of [TTj. Thus, our scheme can be 
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Figure 20. Top: Estimates of level-set obtained from 5 experiments with different initial 
ensembles with samples of N (0, C) with values of L in (56) (from left to right) L = 
0.2,0.1,0.06,0.05. Bottom: True conductivity. 


potentially applied to produce stable and accurate estimates of PDE-constrained inverse 
problems. Moreover, those estimates can be computed at a reasonable computational 
cost without the need of the derivative of the forward map. In this section, we show the 
potential application of the proposed method to solve identihcation problems where the 
computation of such derivative can be very cumbersome. In concrete, in this section we 
investigate the application of the proposed regularizing ensemble Kalman method for 
the solution of shape identihcation problems. 

We are interested in the identihcation of an unknown region within D, the spatial 
domain of dehnition of the PDE under consideration, whose boundary dQ dehnes a sharp 
discontinuity of an unknown parameter in the PDE model. That is the case, for example, 
in subsurface how applications where the conductivity of the aquifer/reservoir may take 
substantially diherent (by several orders of magnitude) values on facies characterized 
with diherent geologies. While the nominal values on such geologic facies may be known 
a priori, the interface between geologic facies is usually unknown. 

A common approach to parameterize an unknown interface (or shape) is to use an 
implicitly parameterization in terms of level-sets [2B]. In other words, we may assume 
that 

n = {xED\ u{x) < 0} (57) 

where here u denotes the level-set function. The zero level-set provides the unknown 
interface which is now parameterized by a function u and thus the identihcation can 
be posed within a functional-analytical framework. Level-set-based methods for shape 
identihcation problems have been extensively applied in the literature [g El HH EQ]. 
However, to our best of knowledge, most of these approaches use a variational framework 
for solving shape-based least-squares problems which, in turn, requires computation 
of the shape derivatives of the forward map. In addition, these standard variational 
formulations require the solution of the level-set equation to evolve the shape so that 
it minimizes the, possibly regularized, squared data misht. This approach often leads 
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to computations of flat level-set functions that need to be redefined. In this section 
we show that the proposed ensemble Kalman method can be used to approximate 
the solution to identification problems without using neither the shape derivative of 
forward maps nor the level-set equation. More precisely, we apply the proposed iterative 
ensemble Kalman method for the solution of identification problems where the interface 
is parameterized by (57). The update formulate (|^ with the controlled/regularized 
selection of a induces the motion of the shape. Moreover, we propose to apply 
the Kalman method with a selection of the initial ensemble of level-sets that consist 
of samples from Gaussian measures which a covariance that enforces some 

smoothness of the level-set function and could potentially incorporate some intrinsic 
correlation length. Nonetheless, this probabilistic approach for the generation of the 
initial ensemble is conducted artificially for the sake of the implementation. However, it 
has been recently shown na that there is zero probability of generating samples (and 
thus initial ensembles) corresponding to flat level-set functions. This, combined with 
the invariance subspace property of the proposed method, ensures that our estimates 
do not become flat thus overcoming the common issue of the flattening of the level-set 
function typical of standard level-set approaches. 


5.1. Estimation of geologic facies 

We are interested in the identification of geologic facies in the test model described 
More precisely we want to estimate the region of high conductivity G 


in section 2.3 


(represented by (57)) in an aquifer D. For this example we prescribe a true conductivity 
displayed in Figure 21 (middle). This true conductivity simulates a typical layered 


structure in the geologic properties of an aquifer. We use in (0-0 to generate 
synthetic data that we wish to invert in order to identify the true conductivity. The 
measurement conhguration is displayed in Figure 21 (right). As before, inverse crimes 


are avoided by using a finer grid for the generation of the truth than the one used for 
the initial ensemble and so for the inversion. 

Our goal is now to apply the proposed ensemble Kalman method on the variable u 
corresponding to the level-set function that parametrizes k as follows 

k{u) = KiXu<0 + KeXu>0, (58) 

where xa denotes the characteristic function of region A and Ki and Ke are the positive 
constants that define the high and low conductivity values of the true conductivity 
Figure!^ (left). We assume these two constants are known; the unknown is the interface 
between the regions of different conductivity. 

As described earlier we prescribe an artificial Gaussian measure from which we 
generate members of an initial ensemble of level-set functions. More concretely we 
consider samples from A^(0,C') with 

C = u(-A)-^ (59) 

where A is the Laplacian operator, a; is a scaling constant and 0 controls the regularity. 


In contrast to (56), the previous expressions defines a covariance with an intrinsic 
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fixed correlation length that we are not able to vary. In Fignre 22 (top row) we show 
some samples from the prior distribntion and in Fignre 22 (bottom row) we show the 
corresponding condnctivities obtained from (58). 


True log(K) 


True p 


Measurement locations 



Figure 21. Left: True hydraulic conductivity. Middle: true hydraulic head. Right: 
Measurement locations. 
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Figure 22. Top: Different realizations from level-set functions sampled of N(0,C) with C 


from (59). Bottom: Conductivities computed from these sample level-set functions (by means 
of (pi). 


We apply the regularizing ensemble Kalman method proposed with different 
ensemble sizes (W = 75,100,150,250,350) and with a fixed parameter p = 0.7. As 
before, we consider 40 experiments with different initial ensembles and the corresponding 
log-data misfit and error with respect to the truth are displayed in Figure Although 
we are applying the ensemble algorithm to the estimation of the level-set function 
M, we consider the relative error with respect to the truth of the corresponding 
conductivity given by (58). Note that a large ensemble is needed in order to obtain stable 
computations in the scheme until the data misfit reaches the value p/p that we use to 
stop the algorithm. This value is indicated with the dotted horizontal line in the bottom 
of Figure For sufficiently large ensemble size (A^^e > 150), on average both data misfit 
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and error decrease. Stable computations are obtained when the algorithm is stopped 
([^ with r ~ 1/p. In the top and middle rows of Figure 24 we display estimates 


via 


of the level-set function and the corresponding conductivity obtained with Algorithm [T] 
(for Ne = 150) from five experiments with different initial ensemble sizes (with initial 
ensemble generated from the same Gaussian measure). The visual agreement between 
the estimates of conductivity and truth (Figure bottom) is remarkable. 







\ 1 

_ 





Figure 23. Error with respect to the truth (top) and log data misfit (bottom) 
from 40 experiments with different initial ensembles of size (from left to right) N)e = 
75,100,150,250,350. 


5.2. EIT. Sharp interfaces. 


In this subsection we consider again the EIT problem but assume that we are now 
interested in recovering a conductivity with sharp discontinuities. Let us then consider 


the conductivity and the electrode configuration displayed in Figure 25 (left). We use 
this field as the true conductivity that we aim at estimating with the ensemble Kalman 


method by using the level-set parametrization of expression (58). Synthetic data are 
generated (and avoiding inverse crimes) as described in subsection 4.2 Some true 


voltages are displayed in Figure 25 (middle and right panels). Level-set approaches 
for shape identification in EIT problems have been studied, for example, in [HI l?7] . 
However, these approaches are variational and use a level-set equation for the evolution 
of the shape. 

As in the preceding subsection, we postulate an artificial Gaussian measure A^(0, C) 
that we use to generate an initial ensemble. We consider again C described by 
(56). However, note that for this subsection, such Gaussian is used to generate 


the ensemble of level-set function rather than conductivities. Nonetheless, as in the 


experiments of subsection 4.2, we expect the parameter L to have an influence on the 


initial ensemble and this the estimate of the level-set and the associated conductivity 


obtained with (58). In Figure 26 we present some samples of the level-set function 
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Figure 24. Top: Estimates of level-set obtained from 5 experiments with different initial 

Middle: Conductivities computed from 
)). Bottom: True hydraulic conductivity. 


ensembles with samples of A^(0, C) with C from (54) 
these estimated level-set functions (by means of (58 



Figure 25. Top-left: true conductivity. Top-middle and top-right: two of the 15 voltage 
patterns generated with the true conductivity. 


(top) and the corresponding condnctivity (bottom) obtained with different valnes of 
L {L = 0.2,0.1,0.06,0.04,0.03) bnt with the same set of KL coefficients. We clearly 
observe that the correlation length of the level-set is reflected in the spatial correlation 
of the interface between the regions of different condnctivities. In Fignre we show 
estimates of level-set fnnction (top) and condnctivities (bottom) obtained, by means of 
Algorithmic with an initial ensemble of W = 200 elements generated from the Gaussian 
distribution described above with the aforementioned values of L. We clearly observe 
that there is an optimal choice of L which yields estimates that visually agree better 
with the truth displayed in Figure 27 (bottom). In this case, 0.06 < L < 0.1 provides 
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(on average) the lowest error w.r.t.truth. In Figure [2^ we show the relative error (top) 
and the log-data misht (bottom) from different experiments corresponding to multiple 
initial ensembles generated with the Gaussian A^(0,C') for the aforementioned choices 
of L. For larger correlation lengths L > 0.2 the data misht displays larger huctuations 
that seem to arise from the appearance of high conductivity close to the electrodes. 
In addition, very small values of L yields conductivities where the interface between 
different values has short correlation length. In the current framework, knowledge of the 
optimal correlation length can be used for the generation of initial ensemble. However, 
for more general/realistic cases where such parameter is unknown, the estimation of 
L should be conducted within the method; this is beyond the scope of the present 
manuscript. 



Figure 26. Top: Level-set functions sampled of N{t), C) with values of L in (56) (from left to 
right) L = 0.2,0.1,0.06,0.04,0.03 (these samples have the same realization of KL coefficients). 
Bottom: Conductivities computed from these sample level-set functions (by means of (58)). 


6. Conclusions 

Our numerical investigation indicates that for sufficiently large ensemble size, the 
proposed regularizing ensemble Kalman method inherits the regularizing properties 
of the LM scheme na. In other words, when is large enough, we observe that 
the proposed selection of the regularization parameter and early termination of the 
scheme prevent the lack of stability typical of unregularized schemes. Fortunately, 
the aforementioned size needed for stability is reasonable and often used in standard 
ensemble implementations for large-scale applications. For the Darcy problem, for 
example, we found that with = 150 stable and reasonably accurate estimates can 
be obtained, in average, within 12 iterations of the scheme when p = 0.7 (and with 
M = 100 measurements). The computational cost for computing these estimates is 1800 
forward model simulations which is comparable with other gradient-based techniques 
where also dozens of iterations are need for the convergence, but at each iteration, 
the explicit computation of the Frechet derivative (if possible at all) has often the 
cost of M linearized forward model simulations (recall M is the total number of 
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Figure 27. Top: Estimates of level-set obtained from 5 experiments with different 
initial ensembles with samples of N{0,C) with values of L in (56) (from left to right) 
L = 0.2,0.1,0.06,0.04,0.03. Middle: Conductivities computed from these estimated level- 
set functions (by means of (58)). Bottom: True conductivity. 








Figure 28. Error with respect to the truth (top) and log data misfit (bottom) from 40 
experiments with different initial ensembles generated from samples of A^(0, C) with values of 
L in (56) (from left to right) L = 0.2,0.1,0.06,0.04,0.03. 


measurements). From the results of the preceding section we can clearly appreciate 
that the computational cost and accuracy of the proposed method with large enough 
Ng is very competitive with variational iterative regularization methods. However, as 
we have consistently reiterated throughout this manuscript, the main advantage of the 
proposed method is that no derivatives of the forward map are needed for the proposed 
ensemble regularizing scheme. 

For the FIT problem, we found that a reasonable ensemble size Ne = 100 is 
sufficient for the method to display stability in the case of 16 electrodes considered 
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in the subsection |2.3.2| and for wide class of tunable parameters. However, we observe 
that the selection of the initial ensemble is crucial for the stability and accuracy of 
the proposed scheme. In concrete, there is a range for the optimal selection of the 
spatial correlation length of the members of the initial ensemble. As we expected, 
good approximations of the truth are obtained if the samples that we use for the 
generation of the initial ensemble have a correlation length similar to the one of the 
true conductivity. While prior knowledge of the regularity and spatial features from the 
truth could be incorporated for the selection of the initial ensemble, we recognize that 
in more general cases such information may not be available. Further research should 
address the estimation of the parameters that control the regularity/spatial features of 
the initial ensemble. 

The numerical experiments from the preceding section show that the proposed 
method is computationally flexible and suitable for a wide class of parameterizations of 
PDF parameters. We demonstrated that the regularizing ensemble Kalman method can 
be successfully used for the identification of shapes where the geometry is parameterized 
with a level-set function. The advantage of having a regularized ensemble method 
has been showcased with the examples of Section where we observe that, with a 
reasonable ensemble size, the proposed method provides stable computations of the 
unknown regions of high conductivity. The regularization properties and our selection 
of the initial ensemble generated from a Gaussian distribution is key for developing a 
robust level-set scheme that avoids solving the level-set equation for the evolution of the 
shape; our scheme induces the motion of the shape in a controlled fashion. 

Standard ensemble Kalman methods have been numerously applied for solving iden¬ 
tification problems where the forward map arises from a large-scale forward model. 
However, these standard implementations are often unstable in particular when small 
ensemble sizes (like the ones used here) are used. We have shown that the proposed reg¬ 
ularizing ensemble method, when the ensemble size is large enough, not only addresses 
the ill-poseness proper of the inverse problem but also the one associated to the small 
size effect. There is a broad class of ensemble methods where the Frechet derivative of 
the forward map is approximated with an ensemble [10]. While most of these methods 
have been developed to address concrete applications in a statistical setting, they could 
be potentially applied for the solution of generic parameter identification problems if 
these methods are empowered with the regularization/stabilization needed to compute 
the solution of such inverse problems. This manuscript offers numerical evidence that 
importing ideas from iterative regularization methods can potentially use with ensemble 
methods to stabilize/regularize the computations of inverse estimates. 
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7. 1. Proof of Lemma 3.1 


Note that if X is a finite dimensional, DG{un), D*G{un) and G in (21) are matrices. 
Moreover, since in this case G is invertible (recall G is positive dehnite) simple 
matrix algebra can be applied to show the desired eqnivalence [33]. When X is an 
infinite-dimensional space the argnment for matrices can no longer be applied. In this 
case, however, we can nse a representer-based argnment. First we note that, since 
Y = DG{un) : X ^ Y has the form DG{un)v = [DGi{un)v ,..., DMG{un)v] with 
DjG{un) : X —M for all j G {1,..., M}. Moreover, we note that 

wDGm{Un)v = {v,DmG*{Un)w)x (60) 

for all tc G M and n G X. Therefore, 

M M 

(w, L)G(Uji'jv')^M ^ ^ WjyiD^G(Uji)v ^ ^ dDjn,G (Un)^mg 

m=l m=l ^ 

and so DG*{un)w = J2m=i ^^G*{un)wm for all w G Let us denote {ei}^-^ the 

canonical basis in Note that 


and so 


DG{un)G DG*{un)ei = DG{un) G DG*{un)l 
= [D,G{u^)CDG:{ur,)l, .. .,DMG{un)GDG*{un)l] 

ejDG{un)GDG*{un)ei = DjG{un) G DiG*{un) 


Let G X be the unique “representer” such that 


(61) 

(62) 

(63) 


for all n G X. On the other hand, from (60) we have DmG{un)v = {v, DG"^{un)l)x for 
all m G {1,..., M}, then G~^rm = DG^{un)l and so, formally, 

r™ = GDG*^{un)l (64) 

We may now consider the following representation in terms of representers 

M 

V = '^ fdmrm + b (65) 

m=l 


where b T rj in X for all j G {1,..., M} (i.e. {G G ^^‘^b)x)- From (64) and (65) 

we have 

M M 

DG{Un)v = ^ (3rnDG{Un)rm + DG{un)b = /3mDG{Un)GDlf,G{Un)l + DG{un)b{66) 

m=l j=l 

Note that DmG{un)b = {G~^l’^rm, G~^/%)x = 0. Therefore, we arrive at 

M 

DG{Ur,)v = Y,PmDG{Un)GDmG*{Un)l = DG{Un)GDG*{Un)/3 (67) 
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where /? = (/3i,... Similarly, from (62)- ( p35| ) we obtain 

jm 

= J2PjPmD^G{un)r, + = P^DG{un)CDG%Un)P + 11^-1/2611^ (68) 


jm 


Combining (67) and ( |68| ) we write 

J{v) = ||r-i/2(y'^ - G{un)) - DG{un)v\\l + a\\G-^G^\ 

as 


JLM{P,b) = \ \T-^y - G{Un) - DG{Un)GDG*{Ur 


X 


+a/3^DG{un)GDG*{un)/3 + 011 ^- 1/2611 


X 


(69) 


From the linearly independece of {DmG{un)}m=iJ is not difficnlt to see that the 
matrix DG{un)GDG*{un) is invertible. Then, from standard argnments it follows that 
the nniqne minimizer of (69) in x (span{rm}m=i)''' is 

/3 = [DG{un)GDG*{un) + ar]-i(2/ - G(nO), 


6 = 0 


(70) 


Expression (24) then follows from writing Un+i — Un = v and using (64)-(65) and (70). 

□ 
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